VALEVAL: Testing Vallex Consistency and Experimenting with Word-Frame Disambiguation

Authors

  • Ondrej Bojar
  • Jirí Semecký
  • Václava Kettnerová
Abstract

VALLEX is a valency lexicon of Czech verbs. We briefly introduce VALLEX and then describe and evaluate the VALEVAL experiment: the annotation of 10256 corpus instances of 109 Czech verbs with valency frames. The inter-annotator agreement of three parallel annotations ranges from 61% to 74%, and κ from 0.52 to 0.62. More than 8000 sentences are now available as the “golden VALEVAL” for word-sense disambiguation experiments. In our first attempts using morphological and syntax-based information, we achieve an accuracy of 70% to 80%.
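The agreement and κ figures above can be illustrated with a minimal sketch. The abstract does not say which κ variant was used, so the code below assumes pairwise Cohen's κ over three parallel annotations; the annotator names, frame labels, and function names are hypothetical toy values, not VALEVAL data.

```python
from collections import Counter
from itertools import combinations


def observed_agreement(a, b):
    """Fraction of instances on which two annotators chose the same frame."""
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_o = observed_agreement(a, b)
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability that both annotators pick the same label
    # if each draws independently from their own label distribution.
    p_e = sum(count_a[l] * count_b[l] for l in count_a.keys() | count_b.keys()) / (n * n)
    if p_e == 1.0:
        # Both annotators always used one and the same label.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data: three annotators assign frame labels to six instances.
annotations = {
    "A": ["f1", "f1", "f2", "f2", "f3", "f1"],
    "B": ["f1", "f2", "f2", "f2", "f3", "f1"],
    "C": ["f1", "f1", "f2", "f3", "f3", "f2"],
}

for (name1, a), (name2, b) in combinations(annotations.items(), 2):
    print(f"{name1}-{name2}: agreement={observed_agreement(a, b):.2f}, "
          f"kappa={cohen_kappa(a, b):.2f}")
```

Reporting one value per annotator pair is one way to arrive at a range such as the one quoted in the abstract; whether the authors computed it this way is an assumption.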

Similar resources

Valency Lexicon of Czech Verbs VALLEX: Recent Experiments with Frame Disambiguation

VALLEX is a linguistically annotated lexicon aiming at a description of syntactic information which is supposed to be useful for NLP. The lexicon contains roughly 2500 manually annotated Czech verbs with over 6000 valency frames (summer 2005). In this paper we introduce VALLEX and describe an experiment where VALLEX frames were assigned to 10,000 corpus instances of 100 Czech verbs – the pairwi...

Verbal Valency Frame Detection and Selection in Czech and English

We present a supervised learning method for verbal valency frame detection and selection, i.e., a specific kind of word sense disambiguation for verbs based on subcategorization information, which amounts to detecting mentions of events in text. We use the rich dependency annotation present in the Prague Dependency Treebanks for Czech and English, taking advantage of several analysis tools (tag...

On Automatic Assignment of Verb Valency Frames in Czech

Many recent NLP applications, including machine translation and information retrieval, could benefit from semantic analysis of language data on the sentence level. This paper presents a method for automatic disambiguation of verb valency frames on Czech data. For each verb occurrence, we extracted features describing its local context. We experimented with diverse types of features, including m...

Extensive Study on Automatic Verb Sense Disambiguation in Czech

In this paper we compare automatic methods for disambiguation of verb senses. In particular, we investigate the Naïve Bayes classifier, decision trees, and a rule-based method. Different types of features are proposed, including morphological, syntax-based, idiomatic, animacy, and WordNet-based features. We evaluate the methods together with individual feature types on two essentially different Czec...

Verb Valency Frames Disambiguation: Dissertation Summary

This is a summary of the author’s PhD dissertation defended on September 17, 2007 at the Faculty of Mathematics and Physics, Charles University in Prague. Semantic analysis has become a bottleneck of many natural language applications. Machine translation, automatic question answering, dialog management, and others rely on high quality semantic analysis. Verbs are central elements of clauses wit...

Journal title:
  • Prague Bull. Math. Linguistics

Volume 83  Issue 

Pages  -

Publication date 2005